Lecture 2

Fitting and Choosing Models

Back to the temperature series

load("data/meteo.RData")
# plot data:
plot(T.outside~date,meteo,type='l')

Removing the diurnal periodicity

Assuming this is a sine function, \(\alpha_1 + \alpha_2 \sin(t + \alpha_3)\), we need non-linear regression (the function is non-linear in \(\alpha_3\))

attach(meteo) # so we can use variable names directly:
f = function(x) sum((T.outside - (x[1]+x[2]*sin(pi*(hours+x[3])/12)))^2)
nlm(f,c(0,0,0))
$minimum
[1] 108956.1

$estimate
[1] 18.189544 -4.904740  1.604442

$gradient
[1] -1.600031e-06 -9.790799e-05  1.904651e-04

$code
[1] 1

$iterations
[1] 9

Removing the diurnal periodicity

# plot data and trend:
plot(T.outside~date,meteo,type='l')
meteo$T.per = 18.2-4.9*sin(pi*(meteo$hours+1.6)/12)
lines(T.per~date,meteo,col='red')

Temperature anomaly

plot(T.outside-T.per~date, meteo, type='l')
title("anomaly")

Temperature anomaly

  • Can we also make estimates of this anomaly?

Temperature anomaly

What can we do with such models?

  • try to find out which model fits best (model selection)
  • learn how they were/could have been generated
  • predict future observations (estimation/prediction/forecasting)
  • generate similar data ourselves (simulation)

How to select a “best” model?

A possible approach is to find the minimum for Akaike’s Information Criterion (AIC) for ARMA(\(p,q\)) models and series of length \(n\): \[AIC = \log \hat{\sigma}^2 + 2(p+q+1)/n\] with \(\hat{\sigma}^2\) the estimated residual (noise) variance.
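As a sketch of how such a comparison can be done in R (on a simulated AR(2) series rather than the temperature data; note that R's `AIC` generic computes the closely related quantity \(-2\log L + 2k\), which ranks candidate orders in essentially the same way as the formula above):

```r
# Sketch: compare AR(p) fits by AIC on a simulated AR(2) series
set.seed(1)
x = arima.sim(list(ar = c(0.6, 0.2)), n = 500)
aics = sapply(0:5, function(p) AIC(arima(x, order = c(p, 0, 0))))
names(aics) = 0:5
aics               # AIC per order p
which.min(aics)    # order with the lowest AIC
```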

Instead of finding a single best model using this single criterion, it may be better to select a small group of “best” models, and look at model diagnostics for each: is the residual white noise? does it have stationary variance?

Even better may be to keep a number of well-fitting models and consider each as an (equally?) suitable candidate.

AIC as a function of \(p\), for AR(p)

 [1] -23547.93 -30235.42 -30713.51 -30772.31 -30815.14 -30816.35 -30818.27
 [8] -30818.39 -30817.82 -30815.84

Anomaly AIC as a function of \(p\), for AR(p)

 [1] -23994.90 -30320.26 -30843.40 -30885.36 -30945.22 -30943.51 -30952.06
 [8] -30957.84 -30955.86 -30955.38

Prediction, e.g. with AR(6)

# Prediction, e.g. with AR(6)
x = arima(T.outside, c(6,0,0))

pltpred = function(xlim, Title) {
  # plot the observed series and predictions (with +/- 2 se bands) up to xlim[2]
  plot(T.outside, xlim = xlim, type='l')
  start = nrow(meteo) + 1
  pr = predict(x, n.ahead = xlim[2] - start + 1)
  lines(start:xlim[2], pr$pred, col='red')
  lines(start:xlim[2], pr$pred+2*pr$se, col='green')  # approx. 95% bounds
  lines(start:xlim[2], pr$pred-2*pr$se, col='green')
  title(Title)
}
# e.g.: pltpred(c(9800, nrow(meteo) + 10), "Prediction, e.g. with AR(6) - 10 minutes")

Prediction, e.g. with AR(6) - 10 minutes

Prediction of Anomaly, e.g. with AR(6) - 10 minutes

Prediction, e.g. with AR(6) - 110 minutes

Prediction of Anomaly, e.g. with AR(6) - 110 minutes

Prediction, e.g. with AR(6) - 1 day

Prediction of Anomaly, e.g. with AR(6) - 1 day

Prediction, e.g. with AR(6) - 1 week

Prediction of Anomaly, e.g. with AR(6) - 1 week

Predicting 1 week: periodic trend + ar(6) for anomaly

Now that we have an idea of the trend and the anomaly, we can combine them to make predictions

plot(T.outside, xlim = c(1,19970), type = 'l')
an = meteo$T.outside - meteo$T.per # anomaly: observed minus periodic trend
x.an = arima(an, c(6,0,0))         # model the anomaly by AR(6)
pr = predict(x.an, 10080)          # predict one week (10080 minutes) ahead
x.pr = as.numeric(pr$pred)
x.se = as.numeric(pr$se)
hours.all = c(meteo$hours, max(meteo$hours) + (1:10080)/60)
T.per = 18.2-4.9*sin(pi*(hours.all+1.6)/12)
lines(T.per, col = 'blue')
hours.pr = max(meteo$hours) + (1:10080)/60
T.pr = 18.2-4.9*sin(pi*(hours.pr+1.6)/12)
lines(9891:19970, T.pr+x.pr, col='red')
lines(9891:19970, T.pr+x.pr+2*x.se, col='green')
lines(9891:19970, T.pr+x.pr-2*x.se, col='green')
title("predicting 1 week: periodic trend + ar(6) for anomaly")

Predicting 1 week: periodic trend + ar(6) for anomaly

Simulation with and without trend

We can also use these models to simulate data

x = arima(T.outside, c(6,0,0))
plot(T.pr + arima.sim(list(ar = x.an$coef[1:6]), 10080, sd = sqrt(0.002556)), 
    col = 'red', ylab="Temperature") # sd: sqrt of the estimated innovation variance
lines(18+arima.sim(list(ar = x$coef[1:6]), 10080, sd=sqrt(0.002589)), 
    col = 'blue')                    # around the overall mean level of 18
title("red: with trend, blue: without trend")

What can we learn from this?

Prediction/forecasting:

  • AR(6) prediction is a compromise between the end of the series and the trend
  • the closer we are to observations, the more similar the prediction is to the nearest (last) observation
  • further into the future, the prediction converges to the trend
  • the more useful (realistic) the trend is, the more realistic the far-into-the-future prediction becomes
  • the standard error of prediction increases when predictions are further in the future.
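These points can be illustrated with a small self-contained sketch (simulated AR(1) data, not the meteo series):

```r
# Sketch: AR predictions revert to the mean, and their se grows with horizon
set.seed(42)
x = arima.sim(list(ar = 0.9), n = 300) + 10   # AR(1) around mean 10
fit = arima(x, order = c(1, 0, 0))
pr = predict(fit, n.ahead = 100)
mu = coef(fit)["intercept"]                   # estimated process mean
abs(pr$pred - mu)   # distance to the mean shrinks geometrically
diff(pr$se)         # standard errors are non-decreasing in the horizon
```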

A side note: fitting a phase with a linear model

A phase shift model \(\alpha \sin(x + \phi)\) can also be modelled by
\(\alpha \sin(x) + \beta \cos(x)\); this is essentially a re-parameterization. Try the following code:

x = seq(0, 4*pi, length.out = 200) # x plotting range

f = function(dir, x) { # plot the combination of a sin+cos function, based on dir
    a = sin(dir)
    b = cos(dir)
    # three curves:
    plot(a * sin(x) + b * cos(x) ~ x, type = 'l', asp=1, col = 'green')
    lines(x, a * sin(x), col = 'red')
    lines(x, b * cos(x), col = 'blue')
    # legend:
    lines(c(10, 10+a), c(2,2), col = 'red')
    lines(c(10, 10), c(2,2+b), col = 'blue')
    arrows(x0 = 10, x1 = 10+a, y0 = 2, y1 = 2+b, .1, col = 'green')
    title("red: sin(x), blue: cos(x), green: sum")
}

for (d in seq(0, 2*pi, length.out = 100)) {
    f(d, x)
    Sys.sleep(.1) # give us some time to think this over!
}
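The animation above visualizes the identity behind this re-parameterization; a quick numeric check (with arbitrary example values for \(\alpha\) and \(\phi\)):

```r
# Check: alpha*sin(x + phi) == alpha*cos(phi)*sin(x) + alpha*sin(phi)*cos(x)
alpha = 2.5; phi = 0.7            # arbitrary example values
a = alpha * cos(phi)              # coefficient of sin(x)
b = alpha * sin(phi)              # coefficient of cos(x)
x = seq(0, 2 * pi, length.out = 100)
max(abs(alpha * sin(x + phi) - (a * sin(x) + b * cos(x))))  # numerically zero
```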

A side note: fitting a phase with a linear model

So, we can fit the same model by a different parameterization:

$minimum
[1] 108956.1

$estimate
[1] 18.189544 -4.478379 -2.000148

$gradient
[1] -0.0003296063 -0.0024207812 -0.0012586475

$code
[1] 1

$iterations
[1] 5

A side note: fitting a phase with a linear model

…which is a linear model, identical to:


Call:
lm(formula = T.outside ~ sin(tt) + cos(tt))

Coefficients:
(Intercept)      sin(tt)      cos(tt)  
     18.190       -4.478       -2.000  
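The `lm` call producing this output is not shown; a sketch on simulated stand-in data (assuming `tt` is the hour angle \(\pi \cdot hours / 12\), which matches the coefficients above) recovers a fit of the same form:

```r
# Sketch: linear fit of the diurnal cycle via sin/cos regressors
set.seed(3)
hours = seq(0, 96, by = 1/6)       # simulated stand-in for meteo$hours
T.outside = 18.2 - 4.9*sin(pi*(hours+1.6)/12) + rnorm(length(hours), sd = 0.5)
tt = pi * hours / 12               # assumed definition of tt
coef(lm(T.outside ~ sin(tt) + cos(tt)))
# expands -4.9*sin(pi*(hours+1.6)/12) into approx. -4.48*sin(tt) - 2.00*cos(tt)
```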

Next: how do we fit coefficients?

Sidetrack: time series analysis for satellite-based land use change detection

  • Satellites continuously observe the surface of the earth
  • How can we use this information to detect and monitor changes like deforestation?
  • Example dataset: MODIS 16-day vegetation index measurements over the Brazilian Amazon

A time-series of images

The EVI (enhanced vegetation index) is a one-dimensional index computed from near infrared, red, and blue reflection measurements. Values can be interpreted as “greenness” and range from -1 to 1.
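For reference, the standard MODIS EVI formula (not given on the slide; coefficients G = 2.5, C1 = 6, C2 = 7.5, L = 1) is:

```r
# MODIS EVI from near-infrared, red and blue surface reflectances
evi = function(nir, red, blue)
  2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
evi(0.4, 0.1, 0.05)   # a dense-vegetation-like example
```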

A time-series of images

For each of the \(1200^2\) pixels, we interpret the set of observations as a time series.

Consider the general model: \[y_t = T_t + S_t + R_t\]

  • \(T_t\) describes the trend
  • \(S_t\) describes seasonal effects
  • \(R_t\) describes remaining variation

We could be interested in changes in both the trend and the seasonality.
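Such a decomposition can be sketched with `stl` on a simulated EVI-like series (23 observations per year, matching the 16-day composites; the values are purely illustrative):

```r
# Sketch: decompose y_t = T_t + S_t + R_t with loess-based STL
set.seed(1)
t = 1:(23 * 10)                                 # ten years of 16-day composites
y = ts(0.6 + 0.001 * t + 0.1 * sin(2 * pi * t / 23) +
       rnorm(length(t), sd = 0.02), frequency = 23)
d = stl(y, s.window = "periodic")
colnames(d$time.series)                         # "seasonal" "trend" "remainder"
```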

Example analysis

An example series (1 pixel in forest area):

Example analysis

We compute differences from the overall mean to center the series around 0.

Example analysis

We assume yearly seasonality and build seasonal sub-series:

\[ \begin{aligned} y_t^{(16)} &= (y_{2005,16},y_{2006,16},y_{2007,16},y_{2008,16}, ...) \\ y_t^{(32)} &= (y_{2005,32},y_{2006,32},y_{2007,32},y_{2008,32}, ...) \\ ... \end{aligned} \] for which we simply compute mean values.
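Computing the seasonal component as sub-series means can be sketched as follows (assuming 23 composites per year and a centered series `y`, here simulated):

```r
# Sketch: seasonal means over yearly sub-series
set.seed(1)
y = 0.1 * sin(2 * pi * (1:(23 * 10)) / 23) + rnorm(23 * 10, sd = 0.02)
pos = rep(1:23, times = 10)         # position within the year
s = tapply(y, pos, mean)            # seasonal component: one mean per position
deseason = y - s[pos]               # subtract it from the series
mean(deseason)                      # exactly zero by construction
```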

Example analysis

After subtracting the seasonal component, the series looks like:

Example analysis

After removing large parts of the seasonality, we can study the trend of the data. Deforestation should be visible as abrupt changes in the trend structure. To test whether there is a structural change, we assume the linear regression model \[T_t = \beta_0 + \beta_1 t\] and hypothesize that the regression coefficients \(\beta_0, \beta_1\) do not change in time.
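A minimal sketch of this idea (a simulated series with a level shift at t = 107, echoing the breakpoint found below; this uses the raw cumulated OLS residuals, not the full test from the literature):

```r
# Sketch: cumulated residuals of a global linear trend reveal a break
set.seed(2)
t = 1:200
y = ifelse(t < 107, 0.5, -0.5) + rnorm(200, sd = 0.2)  # abrupt change at t = 107
r = residuals(lm(y ~ t))
cs = cumsum(r)
range(cs)              # large excursions: residuals do not fluctuate around 0
which.max(abs(cs))     # the largest excursion occurs near the break
```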

Example analysis

Fitting a global trend leads to:

Example analysis

Looking at the cumulated sum of the model’s residuals shows that they do not fluctuate around 0, as they should under our hypothesis.

By thresholding the cumulated residuals based on a given confidence level, we can find a breakpoint around t=107.

Example analysis

A better model:

Example analysis

Literature (sidetrack)

  1. Verbesselt, J., Hyndman, R., Newnham, G., & Culvenor, D. (2010). Detecting trend and seasonal changes in satellite image time series. Remote Sensing of Environment, 114, 106-115. DOI: 10.1016/j.rse.2009.08.014.

  2. R. B. Cleveland, W. S. Cleveland, J.E. McRae, and I. Terpenning (1990) STL: A Seasonal-Trend Decomposition Procedure Based on Loess. Journal of Official Statistics, 6, 3-73.

  3. W. S. Cleveland, E. Grosse and W. M. Shyu (1992) Local regression models. Chapter 8 of Statistical Models in S eds J.M. Chambers and T.J. Hastie, Wadsworth and Brooks/Cole.